[NYS2AWS-134] second attempt transactions merger #70
Merged
pvriel
merged 21 commits into
version-0.x.x
from
NYS2AWS-134-second-attempt-transactions-merger
Feb 11, 2025
New plugin diagram:
sequenceDiagram
box Purple Healthprocessor platform
participant ProcessorService
participant SolrUndersizedTransactionsHealthProcessorPlugin
end
box Darkblue SolrUndersizedTransactionsHealthProcessorPlugin internal workings
participant Shared thread pool
participant Worker thread
end
box Darkgreen Alfresco components (or Healthprocessor platform proxies)
participant AbstractNodeDAOImpl
participant TransactionHelper
end
par Receive new tasks
ProcessorService->>SolrUndersizedTransactionsHealthProcessorPlugin: .doProcess(set of nodeRefs)
SolrUndersizedTransactionsHealthProcessorPlugin->>SolrUndersizedTransactionsHealthProcessorPlugin: Update state.
opt Too many queued tasks
SolrUndersizedTransactionsHealthProcessorPlugin->>SolrUndersizedTransactionsHealthProcessorPlugin: Wait until a previous task has been handled.
end
SolrUndersizedTransactionsHealthProcessorPlugin->>Shared thread pool: Queue new task.
SolrUndersizedTransactionsHealthProcessorPlugin-->>ProcessorService: return healthy reports for all nodeRefs
and Process old tasks in the background
Shared thread pool-->>Worker thread: execute in separate thread
Worker thread->>AbstractNodeDAOImpl: Fetch node IDs for nodeRefs.
AbstractNodeDAOImpl-->>Worker thread: node IDs
Worker thread->>TransactionHelper: Start new transaction.
Worker thread->>TransactionHelper: .getCurrentTransactionId(...)
TransactionHelper-->>Worker thread: transaction ID
Worker thread->>AbstractNodeDAOImpl: .touchNodes(transaction ID, node IDs)
Worker thread->>TransactionHelper: finalize transaction.
Worker thread->>SolrUndersizedTransactionsHealthProcessorPlugin: Update state.
Worker thread->>SolrUndersizedTransactionsHealthProcessorPlugin: Notify about processed task.
end
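The "wait when too many tasks are queued, then hand the batch to the shared thread pool" part of the plugin diagram can be sketched roughly as follows. This is a minimal illustration and not the code from this PR: the class name `BoundedBackgroundProcessor`, the `Semaphore`-based limit and the `Consumer` that stands in for the transaction with `.touchNodes(...)` are all assumptions, and the sketch returns nothing where the real plugin returns healthy reports.

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

// Hypothetical stand-in for the plugin; not the class from this PR.
final class BoundedBackgroundProcessor implements AutoCloseable {
    private final ExecutorService pool;          // "Shared thread pool" in the diagram
    private final Semaphore queueSlots;          // one permit per task that is queued or running
    private final Consumer<List<String>> worker; // stands in for the touchNodes(...) transaction
    private final AtomicInteger processedTasks = new AtomicInteger();

    BoundedBackgroundProcessor(int threads, int maxPendingTasks, Consumer<List<String>> worker) {
        this.pool = Executors.newFixedThreadPool(threads);
        this.queueSlots = new Semaphore(maxPendingTasks);
        this.worker = worker;
    }

    /** Blocks while too many tasks are pending, then queues the batch and returns immediately. */
    void doProcess(List<String> nodeRefs) throws InterruptedException {
        queueSlots.acquire(); // "Wait until a previous task has been handled."
        try {
            pool.execute(() -> {
                try {
                    worker.accept(nodeRefs);
                    processedTasks.incrementAndGet(); // "Update state."
                } finally {
                    queueSlots.release(); // "Notify about processed task."
                }
            });
        } catch (RuntimeException e) {
            queueSlots.release(); // the pool rejected the task, so hand the slot back
            throw e;
        }
    }

    int processedTasks() {
        return processedTasks.get();
    }

    @Override
    public void close() throws InterruptedException {
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
    }
}
```

With this shape the caller is only slowed down when the background workers fall behind, which matches the `opt Too many queued tasks` branch of the diagram.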
New indexing diagram:
sequenceDiagram
box Purple Healthprocessor platform
participant ProcessorService
participant ThresholdIndexingStrategy
end
box Darkblue ThresholdIndexingStrategy internal workings
participant Shared queue A
participant ThresholdIndexingStrategyTransactionIdFetcher
participant Shared queue B
participant ThresholdIndexingStrategyTransactionIdMerger
end
box Darkgreen Alfresco components
participant SearchTrackingComponent
participant dataSource (through JdbcTemplate)
end
ProcessorService->>ThresholdIndexingStrategy: .onStart()
ThresholdIndexingStrategy->>ThresholdIndexingStrategy: Initialize state & progress.
ThresholdIndexingStrategy->>ThresholdIndexingStrategyTransactionIdFetcher: .run() (in separate thread)
ThresholdIndexingStrategy->>ThresholdIndexingStrategyTransactionIdMerger: .run() (in separate thread(s))
par Fetch TxnIDs
loop while not stopped
ThresholdIndexingStrategyTransactionIdFetcher->>dataSource (through JdbcTemplate): Fetch a preconfigured amount of transaction IDs for each worker.
dataSource (through JdbcTemplate)-->>ThresholdIndexingStrategyTransactionIdFetcher: transaction IDs
ThresholdIndexingStrategyTransactionIdFetcher->>ThresholdIndexingStrategyTransactionIdFetcher: Stop if no transaction IDs are received. Otherwise, divide the transaction IDs for the workers.
loop foreach worker
ThresholdIndexingStrategyTransactionIdFetcher->>Shared queue B: Queue batch of transaction IDs.
end
ThresholdIndexingStrategyTransactionIdFetcher->>ThresholdIndexingStrategyTransactionIdFetcher: Update state. Also, stop if amount of transactions != amount of workers * worker batch size.
end
loop foreach background worker
ThresholdIndexingStrategyTransactionIdFetcher->>Shared queue B: Queue a stop signal.
end
and Process TxnIDs
loop while not stopped
ThresholdIndexingStrategyTransactionIdMerger->>Shared queue B: Fetch next batch of transaction IDs.
Shared queue B-->>ThresholdIndexingStrategyTransactionIdMerger: next batch
ThresholdIndexingStrategyTransactionIdMerger->>ThresholdIndexingStrategyTransactionIdMerger: Stop if a stop signal is received.
ThresholdIndexingStrategyTransactionIdMerger->>SearchTrackingComponent: Fetch nodes associated with transaction IDs.
SearchTrackingComponent-->>ThresholdIndexingStrategyTransactionIdMerger: nodes
loop foreach transaction
opt if the transaction size is not sufficiently large (e.g. has not been merged previously)
loop foreach node in transaction
ThresholdIndexingStrategyTransactionIdMerger->>ThresholdIndexingStrategyTransactionIdMerger: Add the ref. of the node to a temporary cache / bucket.
opt bucket is full
ThresholdIndexingStrategyTransactionIdMerger->>ThresholdIndexingStrategyTransactionIdMerger: Create a copy of the bucket and clear the original one.
ThresholdIndexingStrategyTransactionIdMerger->>Shared queue A: Queue copy of bucket.
end
end
end
end
end
ThresholdIndexingStrategyTransactionIdMerger->>Shared queue A: Queue a stop signal.
and Fetch nodeRefs
ProcessorService->>ThresholdIndexingStrategy: .getNextNodeIds(ignored value)
loop while not all background workers have stopped working
ThresholdIndexingStrategy->>Shared queue A: Fetch next batch of NodeRefs.
Shared queue A-->>ThresholdIndexingStrategy: next batch
opt next batch is not a stop signal
ThresholdIndexingStrategy-->>ProcessorService: next batch
end
ThresholdIndexingStrategy->>ThresholdIndexingStrategy: Keep track of amount of stopped background workers.
opt all background workers have stopped working
ThresholdIndexingStrategy-->>ThresholdIndexingStrategy: break
end
end
ThresholdIndexingStrategy-->>ProcessorService: empty batch
end
ProcessorService->>ThresholdIndexingStrategy: .onStop()
ThresholdIndexingStrategy->>ThresholdIndexingStrategy: Reset state and progress.
ThresholdIndexingStrategy->>ThresholdIndexingStrategyTransactionIdFetcher: .interrupt() (single thread)
ThresholdIndexingStrategy->>ThresholdIndexingStrategyTransactionIdMerger: .interrupt() (all threads)
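The queue handling in the indexing diagram (one stop signal per merger on queue B, one stop signal per merger on queue A, and a consumer that counts stop signals) can be sketched as below. This is a simplified illustration under stated assumptions, not the implementation from this PR: the class `ThresholdPipelineSketch`, the in-memory `nodesByTxn` map that replaces `SearchTrackingComponent`, the pre-built `txnBatches` list that replaces the JDBC fetch, and the flush of a partially filled bucket before stopping are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical, simplified pipeline; names and types are not taken from the PR.
final class ThresholdPipelineSketch {
    // Stop signals are compared by identity, so an ordinary empty batch is never mistaken for one.
    static final List<Long> STOP_IDS = new ArrayList<>();
    static final List<String> STOP_REFS = new ArrayList<>();

    /** Merger: drains queue B, buckets the nodes of undersized transactions, feeds queue A. */
    static Runnable merger(BlockingQueue<List<Long>> queueB, BlockingQueue<List<String>> queueA,
                           Map<Long, List<String>> nodesByTxn, int threshold, int bucketSize) {
        return () -> {
            List<String> bucket = new ArrayList<>();
            try {
                while (true) {
                    List<Long> batch = queueB.take();
                    if (batch == STOP_IDS) break;
                    for (Long txnId : batch) {
                        List<String> nodes = nodesByTxn.getOrDefault(txnId, List.of());
                        if (nodes.size() >= threshold) continue; // already large enough
                        for (String ref : nodes) {
                            bucket.add(ref);
                            if (bucket.size() == bucketSize) {
                                queueA.put(new ArrayList<>(bucket)); // queue a copy of the bucket
                                bucket.clear();
                            }
                        }
                    }
                }
                // Assumption: leftovers are flushed before stopping; the diagram does not show this.
                if (!bucket.isEmpty()) queueA.put(new ArrayList<>(bucket));
                queueA.put(STOP_REFS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // e.g. interrupted from .onStop()
            }
        };
    }

    /** Starts the fetcher and the mergers, then collects every batch that reaches queue A. */
    static List<List<String>> run(List<List<Long>> txnBatches, Map<Long, List<String>> nodesByTxn,
                                  int mergers, int threshold, int bucketSize)
            throws InterruptedException {
        BlockingQueue<List<Long>> queueB = new LinkedBlockingQueue<>();
        BlockingQueue<List<String>> queueA = new LinkedBlockingQueue<>();
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < mergers; i++) {
            threads.add(new Thread(merger(queueB, queueA, nodesByTxn, threshold, bucketSize)));
        }
        // Fetcher: queues all batches, then one stop signal per merger.
        threads.add(new Thread(() -> {
            try {
                for (List<Long> batch : txnBatches) queueB.put(batch);
                for (int i = 0; i < mergers; i++) queueB.put(STOP_IDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }));
        threads.forEach(Thread::start);

        // Consumer side, comparable to .getNextNodeIds(...): count stop signals until all mergers are done.
        List<List<String>> result = new ArrayList<>();
        int stopped = 0;
        while (stopped < mergers) {
            List<String> next = queueA.take();
            if (next == STOP_REFS) stopped++;
            else result.add(next);
        }
        for (Thread t : threads) t.join();
        return result;
    }
}
```

Counting one stop signal per merger lets the consumer finish without inspecting thread state, which is the "keep track of amount of stopped background workers" step in the diagram.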
https://xenitsupport.jira.com/browse/NYS2AWS-134
Apart from fetching the transaction IDs (which we simply did not consider), most mechanisms have been parallelized to improve performance. The health processor platform can now process ~3300 transactions/second instead of just 200.
Please note that Sonar complains about InterruptedExceptions that are not being rethrown. Rethrowing them is impossible, since the code is executed in the .run() method of a Runnable instance. Technically speaking, you can use @SneakyThrows (and I considered it), but I think it's better to clearly indicate and handle the exceptions that can be thrown (as much as possible).
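For context, a common way to handle an InterruptedException inside `Runnable.run()`, which cannot declare checked exceptions, is to catch it and restore the thread's interrupt flag. The sketch below only illustrates that pattern; `InterruptAwareWorker` and its queue are hypothetical and are not the classes in this PR.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical worker; only the exception handling is the point here.
final class InterruptAwareWorker implements Runnable {
    private final BlockingQueue<String> queue;
    private final AtomicInteger handled = new AtomicInteger();

    InterruptAwareWorker(BlockingQueue<String> queue) {
        this.queue = queue;
    }

    @Override
    public void run() { // Runnable.run() declares no checked exceptions
        try {
            while (true) {
                queue.take(); // blocks; throws InterruptedException when interrupted
                handled.incrementAndGet();
            }
        } catch (InterruptedException e) {
            // Restore the flag so code further up the stack can still observe the interrupt.
            Thread.currentThread().interrupt();
        }
    }

    int handled() {
        return handled.get();
    }
}
```

The loop ends as soon as the thread is interrupted, and the exception is neither swallowed silently nor hidden behind @SneakyThrows.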
Sonar also complains that the ToggleableHealthProcessorPlugin() constructor is public, although this is required.
Sonar also complains that I implement the mergeTransactions method with a Runnable instance (using the functional interface mechanism) instead of passing a lambda (seriously?). However, if you pass a lambda, the unit tests can't properly mock it.
Explanation of how the indexing strategy works:
Explanation for the healthprocessor plugin: